Mate choice for body size leads to size assortative mating in the Ryukyu Scops Owl Otus elegans

Abstract Understanding evolutionary phenomena that involve size assortative mating requires elucidating the generating mechanisms on which assortment is based. Although various mechanisms have been suggested, their relative importance may differ across taxonomic groups. Males selecting for large, fecund females combined with the dominance of large males in the competition for females has been suggested as a major mechanism in specific groups. However, raptors do not appear to conform to this, because the selection for smallness among males (assumed in a theory of reversed sexual size dimorphism) and the selection for largeness among males (assumed in the theory of size assortative mating) are in opposite directions. We studied assortative mating during long‐term monitoring of the Ryukyu Scops Owl Otus elegans interpositus. Significant assortative mating was found for culmen length (from the base to the tip of the bill) and wing length (from the bend of the wing to the tip of the longest primary). Statistical control of the spatial and temporal accessibility of potential mates did not affect the assortment. Males with short wings had slightly higher fitness components than those with long wings, and females settling early tended to have small wings. Considering that early‐settling females can preferentially choose their mates, these results suggest that smaller females have an advantage when choosing smaller males with good reproductive performance. The improved flying and hunting ability of smaller individuals may underlie the choice of smaller individuals. We propose that active mate choice for small individuals, rather than a passive process such as similarity between individuals and their potential mates, is an explanation for the assortative mating in this owl.


| INTRODUCTION
Mated-pair members often share phenotypic traits, indicative of assortative mating (Jiang et al., 2013). These traits include coloration, body size, aggressiveness, genotype, metabolic state, and intelligence (reviewed in Jiang et al., 2013; Luo, 2017; Wang et al., 2019). Because assortative mating can be an incipient process of speciation, or is assumed to be a prerequisite of speciation models (Bolnick & Fitzpatrick, 2007; Coyne & Orr, 2004; Elmer et al., 2009), and because it can be an outcome of sexual selection (Crespi, 1989; Wang et al., 2019), the consequences of assortative mating have great significance for evolutionary biology. To accurately understand evolutionary phenomena involving assortative mating, the generating mechanisms of the assortment are also of importance (Galipaud et al., 2013). Here, for simplicity, we use the term "assortative mating" to mean the circumstance in which a phenotypic correlation between members of a mated pair can be observed, irrespective of the mechanism by which the correlation arises. Focusing on size assortative mating among avian species, Wang et al. (2019) classified the mechanisms into three categories: (1) like meets like, (2) become alike, and (3) mate choice. We review the three mechanisms in the following three paragraphs.
Mechanism 1: "like meets like" explains the correlation between paired individuals by resemblance between individuals and their potential mates (Wang et al., 2019). For example, potential mates nearby may have similar genotypes and phenotypes (Erlandsson & Rolán-Alvarez, 1998; Indykiewicz et al., 2017). Under such circumstances, members of a pair resemble each other even without a preference for similar individuals. Such similarity also arises from temporal separation of individuals (Hendry & Day, 2005). For lifelong monogamous species, potential mates for young recruits into the breeding population are often also young recruits. As young recruits of raptors often breed later than adults in a breeding season (Warkentin et al., 1992), this leads to assortment for age. If body size differs with age, then correlation of body size may occur as a consequence of age-related temporal assortment (Wagner, 1999).
Mechanism 2: "become alike" explains the correlation between paired individuals by sharing the same environmental effect between mates (Wang et al., 2019). For species in which mates share resources such as territory and food, mates may resemble each other because they feed on similar food, use similar habitat, and are affected by similar environmental effects (Class & Brommer, 2018).
Such similarity is expected to be found in labile traits such as wing length and body mass.
Mechanism 3: "mate choice" explains the correlation between paired individuals by choice (or preference) by one or both mates (Wang et al., 2019). If the choice is based on similarity, then a positive correlation between the mates occurs. However, even without such a preference for similarity, correlation can arise. For example, if males prefer large females for reasons of their fecundity, and if large males are at an advantage in the competition for large females, then such competition results in pairs of large (competitive) males and large (fecund) females, and, conversely, pairs of small (uncompetitive) males and small (infecund) females (Crespi, 1989). Hereafter, we call this the competition-based mechanism. Note that the example above assumes only males' preference for females. Therefore, mutual choice is not necessary for the correlation among pair members to occur.
Although three mechanisms have been suggested irrespective of taxonomic groups, their relative importance may differ across groups. As previously mentioned, a competition-based mechanism assumes that acquiring large fecund females is an advantage for males. In a review of the mechanisms involved in size assortative mating, Crespi (1989) suggested that such a competition-based mechanism is dominant among arthropods. However, it may be less important for taxonomic groups with small variations in fecundity, or among groups with little correlation between female body size and fecundity (Pincheira-Donoso & Hunt, 2017), since sufficient variation within the target of choice is necessary for such choice to work (Lehmann et al., 2007). In addition, the competition-based mechanism may also be less important among species in which female choice is more important than male-male physical competition for females, as is suggested by research into anuran amphibians (Green, 2019). Considering these, if size assortative mating were to occur in an avian species, what might be the contributory mechanisms? Birds lay far fewer eggs than do arthropods, hence birds seem to have low inter-individual variation in fecundity, and they are traditional subjects for studies of female choice of males.
Assortative mating in birds may arise due to them having a different set of contributing mechanisms from other taxonomic groups (Wang et al., 2019).
Hawks, eagles, falcons, and owls (hereafter called "raptors," for simplicity) offer interesting opportunities for investigating the cause of size assortative mating. Firstly, raptors are long-lived and often show life-long monogamy (König & Weick, 2008; McDonald et al., 2005). Such characteristics call for careful mate choice because it can greatly influence life-time reproductive success (Wojczulanis-Jakubas et al., 2018). Secondly, raptors occur worldwide and vary considerably in body size (Schoenjahn et al., 2020). This facilitates comparative analysis of the relationships between size assortative mating and various factors. Thirdly, female raptors are typically larger than males (reversed sexual size dimorphism: Mueller, 1986; Owens & Hartley, 1998; Krüger, 2005). One major hypothesis explaining the evolution of this dimorphism is the small-male hypothesis, which considers that males are selected to be small thereby improving their agility, maneuverability, and foraging efficiency (Hakkarainen et al., 1996; Krüger, 2005). Intriguingly, this selection for smallness is the exact opposite of the selection for largeness which is assumed in the competition-based mechanism for size assortative mating. Furthermore, one of the alternative hypotheses explaining the dimorphism relies on the intersexual size difference having been selected to reduce intersexual competition (Krüger, 2005; Pande & Dahanukar, 2012; see also Mueller, 1986 for other alternative explanations of the dimorphism). If this is the case, then dissimilarity in the size of mates (disassortative mating) rather than similarity (assortative mating) seems to be preferred. Based on these considerations, the occurrence of size assortative mating per se among raptors is interesting since it indicates coexistence of two selection pressures in different directions. Therefore, the underlying mechanism of the assortment is worth investigating.
Here, we address the existence of assortative mating and its generating mechanism in the Ryukyu Scops Owl Otus elegans interpositus, a species in which males are slightly smaller than females (Sawada, Iwasaki, Matsuo, et al., 2021). During the long-term (since 2002) monitoring of an isolated population of this taxon, data have been accumulated on the morphology, reproductive success, territories, and age of breeding pairs. The aims of this study are (1) to describe size assortative mating, and (2) to investigate the possible mechanisms contributing to the detected mating patterns, referring to the three previously documented mechanisms: "like meets like," "become alike," and "mate choice."

2 | MATERIALS AND METHODS

| Material
Otus elegans interpositus is endemic to Minami-daito, a small, isolated, oceanic island in Japan (Ornithological Society of Japan, 2012). The population on the island consists of 200-300 pairs, and their breeding activity and survival history have been studied annually since 2002 (Sawada et al., 2019; Takagi et al., 2007; Takagi, 2020). The owls are monogamous and pair-bonds last, in most cases, until one of the pair dies (Akatani, 2011). Extra-pair copulation occurs, but is uncommon (Sawada et al., 2020). Pairs maintain their territories throughout the year and tend to use the same nest sites in successive years. Females lay a clutch of one to four eggs from mid-March to mid-May (Sawada & Iwasaki, unpublished data; Takagi et al., 2007). The incubation and nestling periods each last about 1 month (Sawada & Iwasaki, unpublished data; Takagi et al., 2007). Males carry food to their mates until the middle of the nestling period, and thereafter the parents share feeding duties (Murakami et al., 2022). There is no significant sexual difference in annual survival rate (Sawada, Iwasaki, Inoue, et al., 2021). The average body masses of adult males and adult females are 88.4 and 92.2 g, respectively, showing slight reversed sexual size dimorphism (Sawada, Iwasaki, Matsuo, et al., 2021).

| Breeding monitoring
Since 2002, nests in natural cavities and nest boxes have been visited regularly to obtain data on breeding success. In this study, we have used data from 285 breeding attempts by 159 unique pairs consisting of 138 individuals (some individuals bred multiple times), which were neither predated nor abandoned and for which we have complete data on the identity of the parents, egg laying date, and number of fledglings (Table S1). All chicks were ringed and measured, and blood samples were collected from them. Detailed field procedures have been described elsewhere (Sawada et al., 2020; Takagi et al., 2007).

| Territory identification
All territorial owls on the island have been recorded as part of mark-recapture (mark-resight) surveys since 2012 (Table S1). From late February to late July, TI (2012-2015) and AS (2016-2019) walked around the entire island using playback almost every night (from sundown to about midnight), except when it rained (see Sawada, Iwasaki, Inoue, et al., 2021). The coordinates of each encounter with territorial owls, along with identity and sex, were recorded.
Individuals were identified by unique combinations of colored reflective tape wrapped around metal leg rings (Takagi, 2020) using binoculars from a distance of about 10 m.

| Body measurements
Almost all breeding individuals (identified during breeding monitoring from 2002 onwards), and unmarked individuals (encountered during territory identification surveys from 2012 onwards) were captured by mist-net, ringed and measured. The measurements of 778 individuals are used in this study. Body mass (to the nearest 0.1 g) was measured using a Pesola spring balance or a digital was measured with a stainless-steel ruler. Measurements were made twice or more during each capture, allowing the use of mean values (see Table 1; Sawada, Iwasaki, Matsuo, et al., 2021). Since the correlations of these variables were weak, the values of the variables were used in the analysis as they were (Figure S1). For the analysis of size assortative mating, we used the measurements of individuals that were confirmed as present from 2012 onwards, because randomization tests (see below) require detailed territory data which has only been available since 2012. However, for the analysis of reproductive success we have used measurements of individuals from

| Sex and age determination
Sex was determined by vocal characteristics, by the presence of a brood patch, or by PCR amplification of the Chromo Helicase DNA-binding gene (Fridolfsson & Ellegren, 1999; Sawada, Iwasaki, Matsuo, et al., 2021; Takagi, 2020). Age class (yearling or adult) was estimated from plumage characteristics following Sawada, Iwasaki, Matsuo, et al. (2021). Age, as used in the analyses below, refers to this dichotomous classification and not an exact age in years. In brief, if a bird met two of three criteria (pointed primaries, soft primary rachides, and worn secondaries), we judged it to be a yearling.
Detailed procedures for sexing and aging the owls are described in Sawada, Iwasaki, Matsuo, et al. (2021).

| Statistical analysis
2.6.1 | Data standardization before analysis
Before analysis, measurement data were statistically controlled for differences between measurers and years (Grant & Grant, 2008; Green, 2019).

| Fundamental analysis of assortative mating
To describe size assortative mating, we calculated Pearson's correlation between measurements of mated males and females.
By using the first measurements that were collected for each individual (some individuals were measured in multiple years), the effects of "become alike" were excluded as much as possible. The significance of the correlation was tested by two methods: the cor.test function in R (hereafter, "parametric test"), and a randomization test (Erlandsson & Rolán-Alvarez, 1998). A parametric test was conducted because it is the commonest method to document assortative mating. A randomization test was conducted because the assumptions of the parametric test can be violated in assortative mating data (i.e. non-normality and/or non-independence).
The procedures of the randomization test were similar to those described by Sawada et al. (2020): (step 1) Using a data matrix containing data for all territories in all years, randomly assign males to females within each year, choosing at random the same number of females as in the actual pair data; (step 2) Calculate Pearson's correlation coefficient from these simulated pairs; (step 3) Repeat steps 1 and 2 1000 times; (step 4) Generate a distribution of correlation coefficients from the simulated values, which serves as the null distribution of correlation coefficients expected under random mating in this owl population; (step 5) Obtain a two-tailed p-value as twice the proportion of simulated values that are more extreme than the actual value. For the traits for which we found significant assortative mating by both the parametric and randomization tests (culmen length and wing length, see Section 3), we further investigated the generating mechanisms of the assortment by the analyses detailed in the following sections.
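The five steps above can be sketched in code. The following is a minimal illustration only, not the authors' implementation: the data layout (`pairs_by_year`, `pool_by_year`), the function names, sampling males without replacement, and reading "more extreme" as the smaller of the two tails are all assumptions made for this example.

```python
import random
from math import sqrt


def pearson(x, y):
    """Pearson's correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def randomization_test(pairs_by_year, pool_by_year, n_iter=1000, seed=0):
    """Null distribution of r under random mating within years.

    pairs_by_year: {year: [(male_size, female_size), ...]}  observed pairs
    pool_by_year:  {year: (male_sizes, female_sizes)}       all territorial birds
    Both layouts are hypothetical and chosen only for illustration.
    """
    rng = random.Random(seed)
    observed = [p for pairs in pairs_by_year.values() for p in pairs]
    r_obs = pearson([m for m, _ in observed], [f for _, f in observed])

    null = []
    for _ in range(n_iter):                      # step 3: repeat steps 1 and 2
        sim_m, sim_f = [], []
        for year, pairs in pairs_by_year.items():
            males, females = pool_by_year[year]
            k = len(pairs)                       # as many females as observed pairs
            sim_f += rng.sample(females, k)      # step 1: draw females ...
            sim_m += rng.sample(males, k)        # ... and assign random males
        null.append(pearson(sim_m, sim_f))       # step 2: r of simulated pairs

    # Step 5: twice the proportion of simulated values beyond the observed one
    # (assumption: the smaller tail is doubled and capped at 1).
    upper = sum(r >= r_obs for r in null) / n_iter
    lower = sum(r <= r_obs for r in null) / n_iter
    p_value = min(1.0, 2 * min(upper, lower))
    return r_obs, null, p_value                  # `null` is the step-4 distribution
```

Drawing within each year keeps simulated mates restricted to birds that were actually present together, which is the only constraint this basic version of the test imposes.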

| Mechanism 1: Like meets like
We took two approaches to test whether mechanism 1 (like meets like) contributes to the assortment: first, statistical control of the spatial and temporal accessibility of potential mates in the randomization test; second, testing whether spatially accessible individuals, and temporally accessible individuals, were similar in size.
The basic premise of the first approach is that non-significance after controlling for mechanism 1 is indicative of a contribution of mechanism 1 to the significance detected above (Erlandsson & Rolán-Alvarez, 1998). We modified step 1 of the previously described randomization test so as to consider the spatial or temporal accessibility of potential mates.
The median dispersal distance of females is 1145 m (Matsuo, unpublished data; Sawada et al., 2019) so, to control for spatial accessibility, we randomly assigned a male within that distance of the focal female to that female. Then, the null distribution and p-value were calculated in the same way. If the distribution moves in the direction of the actual value of the correlation coefficient and the p-value increases, then mating with spatially more accessible mates would explain the size assortment. In this owl population, males settle before females, and females exhibit roaming dispersal behavior, indicating females' assessment of males (Sawada & Takagi, unpublished data).
Therefore, we consider that assignment of males to females reasonably mimics their pair formation process.
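As an illustration of the spatially restricted assignment, the sketch below replaces step 1 of the randomization test with a draw limited to males within 1145 m of each female. It is a simplified example under stated assumptions: coordinates are taken to be planar and in metres, each bird is a hypothetical `(x, y, size)` tuple, the same male may be drawn for more than one female, and females with no male in range are skipped. None of these details is specified in the text.

```python
import random
from math import hypot

MAX_DIST_M = 1145.0  # median female dispersal distance given in the text


def assign_nearby_males(females, males, max_dist=MAX_DIST_M, rng=None):
    """Assign each female a random male located within max_dist metres.

    females, males: lists of (x_m, y_m, size) tuples (hypothetical layout).
    Returns a list of (male_size, female_size) simulated pairs.
    """
    rng = rng or random.Random()
    pairs = []
    for fx, fy, f_size in females:
        # Candidate mates: males whose territory lies within the dispersal radius
        candidates = [m_size for mx, my, m_size in males
                      if hypot(mx - fx, my - fy) <= max_dist]
        if candidates:                      # assumption: skip females with no male in range
            pairs.append((rng.choice(candidates), f_size))
    return pairs
```

The correlation coefficient of the returned pairs would then be computed and the draw repeated to build the null distribution, exactly as in the unrestricted test.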
To control for temporal accessibility of potential mates, we randomly assigned males while considering the age of females and males.
There are three reasons for this treatment: (1) There is a tendency for age assortative mating (see Appendix 3; Table 2); (2) Yearlings tend to breed late in a breeding season, probably due to their late pair formation (see Appendix 3; Tables S4 and S5); (3) Yearlings and adults differ slightly in size (Sawada, Iwasaki, Matsuo, et al., 2021).
Let P_YY, P_YA, P_AY, and P_AA be the observed frequencies of pairs (the first and second subscripts denote the age class, yearling or adult, of males and females, respectively). To mimic pair formation based on age, for yearling females, yearling males were assigned with probability P_YY and adult males with probability P_AY. For adult females, yearling males were assigned with probability P_YA and adult males with probability P_AA. Then, the null distribution and p-value were calculated in the same way. If the distribution moves in the direction of the actual value of the correlation coefficient and the p-value increases, then mating with temporally more accessible, similar-aged mates would explain the size assortative mating. We used the means of the observed frequencies of pairs from 2012 to 2019 as the values of P_YY, P_YA, P_AY, and P_AA (Table 2).
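The age-weighted assignment can be sketched as follows. This is a hedged illustration rather than the authors' code: the dictionary keys, the `(age_class, size)` tuples, and in particular the normalisation of the two relevant frequencies so that they sum to one for each female age class are assumptions introduced here.

```python
import random


def assign_by_age(females, males, freq, rng=None):
    """Assign each female a male whose age class is drawn from pair frequencies.

    females, males: lists of (age_class, size), with age_class 'Y' or 'A'.
    freq: observed pair frequencies keyed 'YY', 'YA', 'AY', 'AA'
          (first letter = male age class, second = female age class).
    Returns a list of (male_size, female_size) simulated pairs.
    """
    rng = rng or random.Random()
    sizes_by_age = {
        'Y': [s for a, s in males if a == 'Y'],
        'A': [s for a, s in males if a == 'A'],
    }
    pairs = []
    for f_age, f_size in females:
        w_yearling = freq['Y' + f_age]      # yearling male with this female class
        w_adult = freq['A' + f_age]         # adult male with this female class
        # Assumption: rescale the two weights to a probability for this female class
        p_yearling = w_yearling / (w_yearling + w_adult)
        male_age = 'Y' if rng.random() < p_yearling else 'A'
        pairs.append((rng.choice(sizes_by_age[male_age]), f_size))
    return pairs
```

Because only the age class of the assigned male is constrained, any change in the null distribution relative to the unrestricted test reflects size differences between age classes.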
To assess similarity among spatially accessible individuals, we described and tested spatial autocorrelation in body size by Mantel test (Mantel, 1967;Appendix 4). We focused on geographic distance and size difference between females and males because our interest is in whether spatially accessible males for females are similar to the females. This test was applied to all years (2012-2019) separately for culmen length and wing length. Holm's correction of p-value was applied to each trait.
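For readers unfamiliar with the Mantel test, a generic permutation version is sketched below. It correlates the off-diagonal entries of two square distance matrices and permutes the rows and columns of one matrix together to obtain a one-sided p-value. This is the textbook form only; the female-male variant actually used is described in Appendix 4 and may differ, and all names here are hypothetical.

```python
import random
from math import sqrt


def _pearson(x, y):
    """Pearson's correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def mantel(d_geo, d_size, n_perm=999, seed=0):
    """Generic Mantel test on two symmetric n x n distance matrices.

    d_geo:  pairwise geographic distances (list of lists)
    d_size: pairwise absolute size differences (list of lists)
    Returns (r_obs, one-sided p-value for a positive association).
    """
    n = len(d_geo)
    rng = random.Random(seed)
    upper = [(i, j) for i in range(n) for j in range(i + 1, n)]
    x = [d_geo[i][j] for i, j in upper]
    r_obs = _pearson(x, [d_size[i][j] for i, j in upper])

    count = 1                                # the observed arrangement counts once
    for _ in range(n_perm):
        perm = list(range(n))
        rng.shuffle(perm)                    # relabel individuals in one matrix
        y = [d_size[perm[i]][perm[j]] for i, j in upper]
        if _pearson(x, y) >= r_obs:
            count += 1
    return r_obs, count / (n_perm + 1)
```

Permuting individual labels rather than single matrix cells preserves the dependence among distances that share an individual, which is why the Mantel test is preferred to an ordinary correlation test for distance data.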
We also investigated the heritability of culmen length and wing length. This was motivated by the fact that previous research has suggested that there is a spatially autocorrelated genetic structure (Sawada et al., 2019), and that if heritable components of body size variation exist, this may translate into spatial heterogeneity in morphological variation. To estimate heritability, we applied parent-offspring regression to father-mother-offspring triads identified during breeding monitoring and territory mapping surveys (63 triads for culmen length and 61 triads for wing length; Lynch & Walsh, 1998).
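A common form of parent-offspring regression estimates narrow-sense heritability as the slope of offspring values on midparent values. The sketch below shows that calculation; whether the midparent value was the predictor in this study is not stated in the text, so this is an assumption, and the function and variable names are hypothetical. Measurements are assumed to have been standardized within sex beforehand.

```python
def midparent_heritability(triads):
    """Slope of offspring trait on midparent trait (least squares).

    triads: list of (father, mother, offspring) trait values.
    Under the usual assumptions of quantitative genetics this slope
    estimates narrow-sense heritability.
    """
    midparent = [(father + mother) / 2 for father, mother, _ in triads]
    offspring = [child for _, _, child in triads]
    n = len(triads)
    mean_mid = sum(midparent) / n
    mean_off = sum(offspring) / n
    cov = sum((m - mean_mid) * (o - mean_off)
              for m, o in zip(midparent, offspring))
    var = sum((m - mean_mid) ** 2 for m in midparent)
    return cov / var
```

If a single parent were used as the predictor instead, the slope would estimate half the heritability, so the choice of predictor matters when comparing estimates.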
To assess similarity among temporally accessible individuals, we described the size difference between individuals that settled as yearlings (early-settlers) and individuals that settled as adults (late-settlers). Since we do not accurately know their breeding status (although they breed in most cases), we refer to them as "settlers," not "breeders." If male early-settlers and female early-settlers have similar body sizes, similar-sized individuals are likely to meet. Based on these considerations, culmen length and wing length of early-settlers (46 males and 20 females) and late-settlers (12 males and 17 females) were compared by t-test within each sex. Data were obtained from territory identification.

| Mechanism 2: Become alike
We took two approaches to test whether mechanism 2 (become alike) contributes to assortment. First we compared the difference in body size of paired individuals when first measured and when last measured. The latter measurement is expected to reflect any changes in body size accumulated after pair formation. If mechanism 2 works, then the difference between mates when last measured is expected to be smaller than when first measured. Furthermore, such change might be more pronounced in a labile trait such as wing length than in a less variable bony trait such as culmen length. We tested these expectations by paired t-test.
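The first comparison can be illustrated with a paired t statistic computed on within-pair size differences. In this sketch the difference is taken as the absolute male-female difference, and the sign convention (last minus first, so a negative t means mates became more similar) is chosen to match the negative t value reported for wing length. Both are assumptions, as are the names. Only the statistic and degrees of freedom are returned, because the Python standard library has no t distribution.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t_mate_difference(first, last):
    """Paired t statistic for the change in within-pair size difference.

    first, last: lists of (male_size, female_size) for the same pairs,
                 taken at first and last measurement respectively.
    Returns (t, df); negative t means the mates' difference shrank.
    """
    change = [abs(ml - fl) - abs(mf - ff)
              for (mf, ff), (ml, fl) in zip(first, last)]
    n = len(change)
    t = mean(change) / (stdev(change) / sqrt(n))
    return t, n - 1
```

The p-value would be obtained by comparing t with a Student's t distribution on the returned degrees of freedom.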
Second, we compared correlation coefficients calculated from first measurements with those from last measurements. The statistical significance of the difference between the two correlation coefficients was tested (Zou, 2007), with a strong positive correlation in last measurements suggesting that mechanism 2 does contribute to assortative mating in the owls.
On the other hand, there are some limitations in these approaches detecting the effect of "become alike." First the analyses do not account for body size change due to growth and senescence.
If body size shows a bell-shaped change over an individual's lifetime (e.g. body mass may increase when young but decrease when old), taking the difference between just two measurements may not detect the precise pattern of "become alike." Second, the analyses do not account for the time span between the first measurement and the last measurement. Again, if body size shows a bell-shaped change over the lifetime, the timing of the measurements is important. Without this information, the analyses may miss evidence of "become alike." Nevertheless, it is difficult to deal with these problems in our dataset, since exact age is unknown for most individuals.
Therefore, it should be noted that analyses for "become alike" are conservative in this study.

| Mechanism 3: Mate choice
We took three approaches to test whether mechanism 3 (mate choice) contributes to assortment. The first approach involved the statistical control of other mechanisms in the fundamental correlational analysis described above, based on the premise that a persistent significant correlation, after controlling for other mechanisms, is indicative of a contribution by mechanism 3 (Erlandsson & Rolán-Alvarez, 1998). Because correlation analysis using first measurement data already minimizes the effect of mechanism 2 (become alike), we controlled only for mechanism 1 (like meets like). The detailed procedures are the same as those described above for testing mechanism 1.
The second approach consisted of an analysis of fitness components. The premise behind this is that, if there is active mate choice with respect to body size, then choosers are likely to benefit from this behavior (Andersson & Simmons, 2006). To test this, we evaluated the relationships between body size and fitness components (number of fledglings, number of recruits, and survival rate); details are given in Appendices 6 and 7 and Table S6.
The third approach was a comparison of body size between female early-settlers and female late-settlers. The premise behind this analysis is selection of males by females. As already mentioned, in this population, females show a roaming dispersal pattern and settle in territories that are held by males. In addition, there are more males than females (Sawada, Iwasaki, Inoue, et al., 2021). For males, rejecting females visiting their territories may not be a good choice.
Therefore, female choice seems to be important in this population. Females that disperse and settle earlier may then have more potential mates to choose from. If early female settlers share specific characteristics, this suggests that such females have an advantage in acquiring specific males. The analysis is the same t-test as that used to assess similarity among temporally accessible individuals in the tests for "like meets like."

| Fundamental analysis of assortative mating
Parametric tests of correlations of body size measurements revealed that there was significant assortative mating with regard to culmen length, bill depth, bill width, head length, and wing length (Table 3; Figure S2). Significant assortment in all of these traits remained after p-value corrections (Table 3). Randomization tests revealed significant assortative mating with regard to culmen length and wing length (Table 3; Figure 1). The assortment in culmen length remained even after p-value corrections (Table 3).
Because the null distributions generated by randomization of all traits, except culmen length and tail length, did not have means near zero (Table 3; Figure 1), the significance of the parametric tests seemed to be overestimated. In subsequent analyses of the generating mechanisms of assortment, we focused on culmen length and wing length, as both tests identified significant assortment for these traits.

| Test of mechanism 1: Like meets like
Statistical control of spatial accessibility of potential mates hardly moved the null distribution toward actual values (Figure 2). p-values were almost unchanged (Table S7, from p < .001 to .002 in culmen length, from p = .026 to .034 in wing length).
Statistical control of the age of potential mates hardly moved the null distribution toward actual values (Figure 2). p-values were almost unchanged (Table S7, from p < .001 to .001 in culmen length, from p = .026 to .050 in wing length).
When comparing early-settlers and late-settlers, culmen length did not differ significantly in males (Figure 3a

| Test of mechanism 2: Become alike
For culmen length, differences in last measurements and first measurements did not differ significantly (Figure 5a, paired t-test, t = 0.354, df = 235, p = .724). However, for wing length, the difference in last measurements was significantly smaller than the difference in first-time measurements (Figure 5b, paired t-test, t = −3.192, df = 229, p = .002).
Correlation coefficients calculated from last measurements ( Figure S4) were not significantly different from the coefficients calculated from first measurements (

| Test of mechanism 3: Mate choice
Statistical control of spatial and temporal accessibility of potential mates did not completely cancel the significance of the correlation coefficients for culmen length and wing length, as described above (Figure 2; Table S7).
The best model for number of fledglings was one that included year, male age, and male wing length (Table S9)
Table S10, coefficient ± SE = −0.02 ± 0.01, Z = −1.56, p = .12) although this was not significant. Applying model averaging to the best set of models, effect size and significance of male age and male wing length were almost unchanged (Table S11).
The best model for number of recruits was a model including year, egg laying date, male age, and male wing length (Table S12). From this model, the number of recruits in pairs involving yearling males was 0.62 times that in pairs with adult males (Figure 6b; Table S13, coefficient ± SE = −0.47 ± 0.28, Z = −1.67, p = .10); a 10-day delay in egg laying reduced the number of recruits to 0.85 times (Table S13, coefficient ± SE = −0.02 ± 0.01, Z = −1.63, p = .10), and a 10 mm reduction in male wing length increased the number of recruits 1.44 times (Figure 6b; Table S13, coefficient ± SE = −0.04 ± 0.02, Z = −1.68, p = .09), although the effects were marginal. Applying model averaging to the best set of models, effect size and significance of male age and male wing length were almost unchanged (Table S14).
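The multiplicative effects quoted above follow from exponentiating model coefficients. The sketch below shows the arithmetic, assuming a count model with a log link (the model family is specified in the appendices, not in this section, so the link is an assumption) and using the rounded coefficients from Table S13 as quoted in the text.

```python
from math import exp


def rate_ratio(coefficient, change=1.0):
    """Multiplicative change in the expected count for a given change in
    the predictor, under a log-link model (assumed here)."""
    return exp(coefficient * change)


yearling_male = rate_ratio(-0.47)          # yearling vs adult male, about 0.63
ten_days_later = rate_ratio(-0.02, 10)     # 10 days later laying, about 0.82
wing_10mm_shorter = rate_ratio(-0.04, -10) # 10 mm shorter wing, about 1.49
```

These values are close to, but not identical with, the reported 0.62, 0.85, and 1.44; the small discrepancies presumably arise because the coefficients are quoted to two decimal places.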

TABLE 3 Results of fundamental correlation analyses
Survival rate was not significantly affected by culmen length in either sex (Figure 6c; Table S15,
Early-settling and late-settling females did not differ significantly in culmen length or wing length, although early-settling females had slightly shorter wings than late-settling females, as described above (Figures 3 and 4).

| DISCUSSION
In this study of size assortative mating of the Ryukyu Scops Owls on Minami-daito Island, we found significant assortative mating for culmen length and wing length by two statistical approaches (parametric and randomization tests). Statistical control of spatial and temporal accessibility in the test did not cancel the assortment.
Females settling in their first year tend to have small wings, and males with short wings tend to have good fitness components. The differences in wing lengths of paired individuals were smaller later in their paired period than in their early paired period. We discuss possible generating mechanisms of assortative mating in this owl.

| Possible mechanism of assortative mating
For both culmen length and wing length, active mate choice (mechanism 3) is a likely explanation for the assortative mating in this owl population. There are four reasons for this. First, statistical control of spatial and temporal accessibility did not cancel the assortment.
Second, the Mantel test did not find significant similarity between females and potential mates. Third, early settlers and late settlers did not differ significantly in culmen length in either sex. These three results indicate little contribution of mechanism 1 "like meets like" to the assortment. Fourth, our results already minimize the effect of mechanism 2 "become alike" by using first measurement data.
For the assortment with respect to culmen length, there was no further support of the interpretation above, because no fitness components correlated with the trait. However, for the assortment with respect to wing length, mate choice is further supported for two reasons. First, males with short wings have an advantage in reproduction, which is suggested by the fact that males with shorter wings produced slightly more fledglings and recruits in a single breeding attempt (albeit not significantly
advantages in acquiring territories or mates, indicating that short-winged females have priority in accessing males with good reproductive performance.
Nevertheless, the contribution of mechanism 2 "become alike" cannot be completely ruled out for the assortment by wing length, because the size difference between mates was smaller later in their paired period. We used measurements made when we first captured individuals to replicate as closely as possible any correlation at the time of pair formation. However, we were unable to obtain measurements at the exact timing of pair formation. Thus, we face the limitation that, if the wing lengths of mated individuals become alike after pair formation, our first measurements may already have been taken after the mechanism "become alike" began operating, generating a positive correlation between males and females. However, because the Mantel test for spatial autocorrelation structure did not detect similarity in wing length between nearby individuals, sharing similar habitat may not lead to similarity between individuals, indicating that "becoming alike" is unlikely.
Physical constraint (Crespi, 1989), a mechanism that we did not address in this study, also cannot be ruled out. For some species of arthropod with long copulation time and/or specific copulatory behavior, inefficiency in copulation between mates with a large size difference is suggested as a cause of size assortment (Han et al., 2010).
Compared with arthropods that may remain in contact for several hours (Crespi, 1989), the duration of avian copulation is very much shorter, lasting for only a few seconds, or at most several tens of seconds (Birkhead et al., 1987). Whether the physical constraint is important or unimportant in brief copulation by birds are unknown at present.

| Costs and benefits of body size
If short-winged males have good fitness components and short-winged females can settle early, what costs and benefits produce this tendency? For males, short wings would have benefits in flying and hunting ability (Hakkarainen et al., 1996;Mueller, 1986) and costs in physical fighting (McDonald et al., 2005). However, the owls rely on vocal contests in territory competition, and physical contests are rarely observed (Bai & Severinghaus, 2012).
Therefore, benefits of short wings may outweigh costs for males.
For females, short wings would again have benefits in hunting efficiency (Massemin et al., 2000) and costs in breeding behaviors such as egg production and incubation (Krüger, 2005;Mueller, 1986). Importantly, these breeding costs matter only after settling, whereas the benefits of hunting efficiency matter even before settling. Therefore, at least before settling, the benefits of short wings may outweigh the costs for females as well.

| Mechanisms to assess body size
A question we did not address in this study is how the owls know the body size of other individuals. Because of their nocturnality, owls rely heavily on vocal communication (König & Weick, 2008), and may use vocal characteristics to infer body size. In this owl, body size (tarsus length, culmen length, bill width, head length, tail length, body mass) correlates with hoot frequency (Takagi, unpublished data). Recording the behavioral responses to hoots at various frequencies would be a promising way to answer this question in future research (Podos, 2010). Nevertheless, body size may also be assessed visually, and plumage characteristics may also be important (Galeotti & Rubolini, 2007).

| Taxonomic differences in the generating mechanisms of assortative mating
The relative contribution of previously proposed mechanisms for assortative mating may depend on taxonomic group. For arthropods and fishes, large males have an important advantage in competition for large fecund females (Crespi, 1989;de Almeida Borghezan et al., 2019;Taborsky et al., 2009). This implicitly assumes that female fecundity (number of eggs) increases with body size. However, some taxonomic groups, such as birds, do not conform to this assumption since their females produce far fewer eggs than either arthropods or fishes. Therefore, mate choice mechanisms that are not based on female fecundity may be more important, because the merits of competing for large females seem to be limited. Support for the "like meets like" mechanism does exist (Hedenström, 1987;Indykiewicz et al., 2017). However, "mate choice" cannot be ignored. Catry et al. (1999) conducted a rare study of the causes of size assortative mating in birds (skuas and jaegers) that exhibit reversed sexual size dimorphism. They suggested that small males had an advantage in acquiring mates because females rejected large males. Therefore, male smallness, rather than female largeness, may be important for assortative mating in species with reversed sexual dimorphism.
Similarly, our study supports a mechanism whereby small females have an advantage in acquiring small males with good reproductive performance. This is a simple corollary from the traditional explanation of size assortative mating in which large males have an advantage in acquiring large females with good fecundity.

| CONCLUSION
We have shown that size assortative mating, with respect to culmen length and wing length, occurs in the Ryukyu Scops Owl, and that mate choice is a possible mechanism contributing to the assortment.
Specifically, small females seemed to choose small males, which are expected to give good reproductive output for the females. The background of that choice may be the benefit of being small in terms of flying and hunting ability. Since reports of size assortative mating in raptor species often describe only whether it occurs, future studies should focus on the causes of the assortment. Our understanding of size assortative mating has been built mainly on organisms with non-reversed sexual size dimorphism, so focusing also on those with reversed sexual size dimorphism will help to extend our understanding.

[…] (Grant Number 17770019, 21570022, 16H04737, 19J12833 and 21J00958) and Tokyo Zoological Park Society.

CONFLICT OF INTEREST
The authors declare that no conflicts of interest exist.

DATA AVAILABILITY STATEMENT
All data will be archived at Dryad upon acceptance.

SUPPORTING INFORMATION
Additional supporting information can be found online in the Supporting Information section at the end of this article.

STANDARDIZATION PROCEDURES
Before analysis, all measurement data were statistically controlled for differences between measurers and years to avoid false positive correlations, which can arise from pooling heterogeneous measurement data. Using Sawada, Iwasaki, Matsuo, et al.'s (2021) estimates of the effects of measurers and years, all measurements were standardized to the measurements made in 2019 by AS. For example, as culmen length measurements by TI and AS were estimated to be −0.04 and 0.21 mm larger than those by KA, 0.21 − (−0.04) = 0.25 mm was added to the measurements by TI so that the values obtained by him take the same mean as the measurements by AS (Figure S5). The effect of year was adjusted in the same manner. The values used for this adjustment are given in Tables S2 and S3 of this study and also in the supplementary tables in Sawada, Iwasaki, Matsuo, et al. (2021). If such standardizations are not applied, correlation coefficients calculated from the raw data become inflated. The reason for this inflation is illustrated in Figure S6. Before standardization, the data are scattered because of the effect of measurer. Therefore, if we calculate a correlation coefficient while ignoring the effect of measurer, the absolute value of the coefficient becomes large because of the elongated distribution (Figure S6a). However, this effect can be controlled for by standardizing for the effect of measurer a priori (Figure S6b).
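The measurer adjustment in the culmen length example can be sketched as follows. This is only an illustration under stated assumptions: the function and variable names are hypothetical, only the two measurer effects quoted in the text are used (with KA as the baseline, hence an effect of 0), and the year adjustment is omitted.

```python
# Estimated measurer effects on culmen length (mm) relative to KA,
# as quoted in the text. KA is the baseline, so its effect is 0.
MEASURER_EFFECT_MM = {"KA": 0.0, "TI": -0.04, "AS": 0.21}
REFERENCE_MEASURER = "AS"  # all values are standardized to measurements by AS


def standardize_culmen(value_mm, measurer):
    """Shift a measurement so that it is comparable with measurements by AS.

    The added offset is (effect of reference) - (effect of measurer),
    e.g. 0.21 - (-0.04) = 0.25 mm for measurements taken by TI.
    """
    offset = MEASURER_EFFECT_MM[REFERENCE_MEASURER] - MEASURER_EFFECT_MM[measurer]
    return value_mm + offset


# Hypothetical raw measurement of 10.00 mm taken by each measurer.
for name in ("KA", "TI", "AS"):
    print(name, round(standardize_culmen(10.0, name), 2))
```

Measurements by AS are left unchanged, which is what makes AS the reference in this sketch.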

APPENDIX 2 DATA MATRIX
Combining the measurement data and data from breeding monitoring, territory mapping, and sex and age determination, five data frames (A-E) were generated in R so that the subsequent analyses could use the data directly (Figure S7). Data matrix E contains the measurement data for father-mother-offspring triads. Because these data were used for parent-offspring regression, which requires one measurement value for each individual, mean values across years were used for the measurement data in this matrix.

APPENDIX 3 CONTROLLING FOR TEMPORAL ACCESSIBILITY
To control for temporal accessibility of potential mates in randomization tests, we randomly assigned males while considering the age of females and males. There are three reasons for this treatment: (1) There was a tendency for age assortative mating (see below); (2) Yearlings tend to breed late, probably due to their late pair formation (see below); (3) There is a slight size difference between yearlings and adults (Sawada, Iwasaki, Matsuo, et al., 2021).
To confirm age assortative mating, we applied Fisher's exact test to a cross table containing the frequencies of pairs of (1) yearling male and yearling female, (2) yearling male and adult female, (3) adult male and yearling female, and (4) adult male and adult female. The test was conducted for yearly data and for pooled data.
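As an illustration of the test on such a 2 × 2 cross table, the sketch below computes a two-sided Fisher's exact p-value from the hypergeometric distribution using only the Python standard library. The counts are hypothetical, not data from this study, and the two-sided rule used here (summing the probabilities of all tables no more likely than the observed one) is an assumption about the variant of the test.

```python
from math import comb


def fisher_exact_two_sided(table):
    """Two-sided Fisher's exact test for a 2 x 2 table [[a, b], [c, d]]."""
    (a, b), (c, d) = table
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2

    def prob(k):
        # Hypergeometric probability that the top-left cell equals k,
        # given the fixed row and column totals.
        return comb(row1, k) * comb(row2, col1 - k) / comb(n, col1)

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    # Sum over all tables that are at most as probable as the observed one
    # (a small tolerance guards against floating-point ties).
    p_value = sum(
        prob(k) for k in range(lowest, highest + 1)
        if prob(k) <= p_observed * (1 + 1e-7)
    )
    return min(1.0, p_value)


# Hypothetical counts: rows = male age (yearling, adult),
# columns = female age (yearling, adult).
pairs = [[8, 2], [1, 5]]
print(fisher_exact_two_sided(pairs))
```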
There was significant age-assortative mating in some years […]. We included these random effects because our data contained multiple records from the same parents. When we varied the combination of random effects, the model with Mother ID gave the smallest AIC value (Table S4). Therefore, we used GLMMs with Mother ID in the subsequent analyses.
Then, we searched for the best combination of fixed effects in terms of AIC using the dredge function in the MuMIn package. Because the best model gave an AIC value that was much smaller than that of the second-best model (ΔAIC = 6.17, Table S4), we interpreted the results without model averaging.
The best model for egg laying date included significant effects of male age and female age (Table S4). According to this model, egg laying date in pairs with yearling males was 2.97 days later than in pairs with adult males (Table S5, coefficient ± SE = 2.97 ± 1.31, t = 2.27, df = 257.24, p = 0.02), and egg laying date in pairs with yearling females was 4.79 days later than in pairs with adult females (Table S5, coefficient ± SE = 4.79 ± 1.53, t = 3.13, df = 274.23, p < 0.01).

APPENDIX 4 MANTEL TEST
To assess similarity among spatially accessible individuals, we described and tested spatial autocorrelation in body size with the Mantel test. We focused on geographic distance and size difference from females to males because our interest is whether the males that are spatially accessible to females are similar to those females. Therefore, the input matrices were non-square $n_f \times n_m$ matrices X and Y, where $n_f$ and $n_m$ are the numbers of females and males, respectively. The elements $x_{ij}$ and $y_{ij}$ are the geographic distance and the absolute size difference between female i and male j. Note that this is not a common setting for the Mantel test, because the test is often applied to a square matrix to ask, for example, whether "males" are similar to spatially neighboring "males." Here, our question is whether "females" are similar to spatially neighboring "males." We followed the original description of the test (Mantel, 1967) to apply it to non-square matrices. The test statistic M was defined as $M = \sum_{i} \sum_{j} x_{ij} y_{ij}$, with the summation taken over all i and j. The null distribution of M and the corresponding p-value were obtained by a permutation test (1000 permutations).
This test was applied to all years (2012-2019) separately for culmen length and for wing length.
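A minimal sketch of a Mantel-type permutation test on non-square female-by-male matrices is given below. Several details are assumptions for illustration rather than the procedure used in the study: the permutation scheme (rows and columns of Y shuffled independently), the one-sided upper-tail p-value, the (count + 1) / (permutations + 1) convention, and all function names and example values.

```python
import random


def mantel_statistic(x, y):
    """M = sum over all i, j of x[i][j] * y[i][j]."""
    return sum(
        x_ij * y_ij
        for x_row, y_row in zip(x, y)
        for x_ij, y_ij in zip(x_row, y_row)
    )


def mantel_test(x, y, n_perm=1000, seed=0):
    """Permutation test for association between two n_f x n_m matrices.

    x holds geographic distances and y holds absolute size differences
    between female i (rows) and male j (columns).
    """
    rng = random.Random(seed)
    n_f, n_m = len(y), len(y[0])
    observed = mantel_statistic(x, y)
    at_least_as_large = 0
    for _ in range(n_perm):
        # Assumed scheme: shuffle female (row) and male (column) labels of Y.
        rows = rng.sample(range(n_f), n_f)
        cols = rng.sample(range(n_m), n_m)
        y_perm = [[y[i][j] for j in cols] for i in rows]
        if mantel_statistic(x, y_perm) >= observed:
            at_least_as_large += 1
    p_value = (at_least_as_large + 1) / (n_perm + 1)
    return observed, p_value


# Hypothetical example with 3 females and 4 males.
distance_m = [[50, 120, 300, 410], [130, 40, 220, 330], [310, 240, 60, 150]]
size_diff_mm = [[0.2, 0.9, 1.5, 0.4], [1.1, 0.3, 0.8, 1.9], [0.6, 1.2, 0.1, 0.7]]
print(mantel_test(distance_m, size_diff_mm))
```

A large M relative to the permuted values would indicate that size differences grow with distance, that is, that nearby females and males tend to be similar.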

APPENDIX 5 TEST OF DIFFERENCE BETWEEN TWO CORRELATION COEFFICIENTS
We compared correlation coefficients at first and last measurements. Statistical significance was based on Z test using Fisher's Z transformation of correlation coefficients.
Let $z_{\mathrm{before}}$ and $z_{\mathrm{after}}$ be the Fisher's Z transformations of the correlation coefficients calculated from the first and last measurements, respectively. The test statistic Z and the p-value were calculated from these values and the sample sizes $n_{\mathrm{after}}$ and $n_{\mathrm{before}}$. In this study the two sample sizes are identical and are written as n.
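For illustration, the sketch below implements the textbook Z test for the difference between two correlation coefficients. It assumes the usual form, Z = (z_after − z_before) / sqrt(1/(n_after − 3) + 1/(n_before − 3)), with a two-sided p-value from the standard normal distribution; whether the study used exactly this form (including the sign convention and tail) is an assumption, and the function name and input values are hypothetical.

```python
import math


def compare_correlations(r_before, r_after, n_before, n_after):
    """Z test for the difference between two correlation coefficients."""
    # Fisher's Z transformation: z = atanh(r) = 0.5 * ln((1 + r) / (1 - r)).
    z_before = math.atanh(r_before)
    z_after = math.atanh(r_after)
    # Standard error of the difference between two transformed coefficients.
    se = math.sqrt(1.0 / (n_after - 3) + 1.0 / (n_before - 3))
    z_stat = (z_after - z_before) / se
    # Two-sided p-value under the standard normal distribution.
    p_value = math.erfc(abs(z_stat) / math.sqrt(2.0))
    return z_stat, p_value


# Hypothetical correlations at first and last measurement, with n = 103 pairs.
print(compare_correlations(r_before=0.30, r_after=0.50, n_before=103, n_after=103))
```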

APPENDIX 6 NUMBER OF FLEDGLINGS
Using data matrix C, we evaluated the effects of body size on number […]. We included the random effects because our data contained multiple records from the same parents. However, the variance estimates for the random effects were zero, and the model without the random effects gave a smaller AIC value than the model with the random effects (Table S6). Therefore, we dropped the random effects from subsequent analyses and used GLM instead of GLMM.
Then, we searched for the best combination of fixed effects in terms of AIC using the dredge function in the MuMIn package.
Top-ranked models with ΔAIC < 2 (difference from the minimum AIC smaller than 2) were model averaged by model.avg to obtain model-averaged regression coefficients and corresponding p-values.
Specifically, the "subset" coefficients in the output of model.avg were used for interpretation.
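The idea behind the ΔAIC < 2 cutoff and the "subset" (conditional) average can be sketched as follows: each coefficient is averaged with Akaike weights over only those top-ranked models that contain the term. This is a conceptual illustration, not the MuMIn implementation, and the model list, term names, and values are hypothetical.

```python
import math


def akaike_weights(aics):
    """Akaike weights, proportional to exp(-delta_AIC / 2)."""
    best = min(aics)
    relative = [math.exp(-(aic - best) / 2.0) for aic in aics]
    total = sum(relative)
    return [value / total for value in relative]


def subset_average(models, delta_max=2.0):
    """Average coefficients over top-ranked models that contain each term."""
    best = min(model["aic"] for model in models)
    top = [model for model in models if model["aic"] - best < delta_max]
    weights = akaike_weights([model["aic"] for model in top])
    terms = sorted({term for model in top for term in model["coef"]})
    averaged = {}
    for term in terms:
        # Keep only the models that include this term, then renormalize.
        pairs = [
            (w, model["coef"][term])
            for w, model in zip(weights, top)
            if term in model["coef"]
        ]
        weight_sum = sum(w for w, _ in pairs)
        averaged[term] = sum(w * beta for w, beta in pairs) / weight_sum
    return averaged


# Hypothetical candidate models; the third is excluded because delta AIC >= 2.
candidates = [
    {"aic": 100.0, "coef": {"male_wing": -0.10, "female_wing": 0.05}},
    {"aic": 101.0, "coef": {"male_wing": -0.08}},
    {"aic": 105.0, "coef": {"female_wing": 0.20}},
]
print(subset_average(candidates))
```

In contrast, a "full" average would treat a term as having a coefficient of zero in models that omit it, which shrinks the averaged estimate toward zero.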

APPENDIX 7 NUMBER OF RECRUITS
Using data matrix C, we evaluated the effects of body size on the […]. We included these random effects because our data contained multiple records from the same parents. However, the variance estimates for the random effects were zero, and the model without the random effects gave a smaller AIC value than the model with the random effects (Table S6). Therefore, we dropped the random effects from subsequent analyses and used GLM instead of GLMM.
Then, we searched for the best combination of fixed effects in terms of AIC using the dredge function in the MuMIn package. Top-ranked models with ΔAIC < 2 (difference from the minimum AIC smaller than 2) were model averaged by model.avg to obtain model-averaged regression coefficients and corresponding p-values. Specifically, the "subset" coefficients in the output of model.avg were used for interpretation.